-
Notifications
You must be signed in to change notification settings - Fork 589
Add limits to the size of the string repetition multiplier #23561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: blead
Are you sure you want to change the base?
Conversation
Historically, given a statement like `my $x = "A" x SOMECONSTANT;`, no examination of the size of the multiplier (`SOMECONSTANT` in this example) was done at compile time. Depending upon the constant folding behaviour, this might mean: * The buffer allocation needed at runtime could be clearly bigger than the system can support, but Perl would happily compile the statement and let the author find this out at runtime. * Constants resulting from folding could be very large and the memory taken up undesirable, especially in cases where the constant resides in cold code. This commit adds some compile time checking such that: * A string size beyond or close to the likely limit of support triggers a fatal error. * Strings above a certain static size do not get constant folded.
953cf3f
to
3725af9
Compare
One of the mottos of perl, at least when I picked up the book, was "no internal limits". Just to get them out of the way, I'm going to front load these:
More constructively:
|
Thanks for the feedback, @guest20. I can see that the discussion might have to balance what people could conceivably do with improving the guardrails around the patterns of usage we can observe on CPAN / other public code repositories.
More than
Yes, the user doing something funky with magic is definitely worth considering. This PR is checking for a right operand that is Perhaps the code should check for a |
Care to share details of the platform you're running on to help me understand use cases better?
Ah, not magic in the Last time I grepped CPAN for this, there was definitely usage where it seemed like people would expect the whole string - e.g. for preparing a buffer or some kind of initialization - and not some iterator behaviour. Maybe that would be more of a feature request for a separate operator?
That's what i use my RAM for. |
What would you be looking for to resolve those tickets or declare them "wontfix"? Both of those tickets are about constant folding producing huge strings, possibly in rarely-taken or even never-taken branches, and that memory use being undesirable. The options suggested there seemed to be:
|
(I've pushed a commit changing the DIE to a warning, in case that's helpful to the discussion.) |
033a494
to
fdc3bd8
Compare
For discussions on Perl#23561. perl -e 'use warnings; my $x = ($_) ? "A" x (2**62) : "Z"' gives this on blead for me: ``` Out of memory! panic: fold_constants JMPENV_PUSH returned 2 at -e line 1. ``` on the previous commit, it would die: ``` Unrealistically large string repetition value" ``` With this commit, it just warns: ``` Unrealistically large string repetition value at -e line 1. ``` but will blow up if the repetition OP does get executed: ``` Out of memory in perl:util:safesysrealloc ```
fdc3bd8
to
8fa7f8e
Compare
Well, to maintain back-compat it'd have to be full of tie-magic so the caller got the corresponding face full of bytes when they |
#20586 was already addressed by #20595. I think the OP of #13793 is addressed generally by copy on write as modified by #20595. This change does address the point @iabyn makes in #20586 (comment) where constant folding of the repetition operator can result in large SVs that last the lifetime of the program, even if they're unused (something I've had to workaround occasionally). The original error from #13324 has been addressed by integer overflow checks added over the years:
and by the Out of memory you get when trying to allocate beyond the memory available:
though the error reporting could probably be better. |
CW: my knowledge of guts is all from rumour and hearsay (possibly also from heresy) @tonycoz What issue does the thing being in ... constant folded space... cause that the same large string being in regular-old scalar wouldn't cause? |
I still like the idea of at least warning at compile time that the repetition is likely to be fatal. |
} | ||
} else { | ||
NV rhs = 0.0; rhs = SvNV_nomg(constsv); | ||
if (rhs >= (NV)((SIZE_MAX >> 2) +1) ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be a -1.0 for safety? It can really questionable what the "53rd digit" to the right side of the .
is, and what CC, what CPU, what OS, which security or spectre patch for your OS and CC, and make month and year of all 4 please.
I don't trust C's float/double keyword's rounding modes at all, and were constant folding intermediate values done in FP CPU real instructions or C abstract machine instructions, calculations done at 32, 64, or 80 bit or 128 bit intermediate floating point precision?
the goose has been cooked at malloc(2 GB) or malloc(2GB-1 byte) either way. thats not a future bug ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also don't forget that Intel 64/AMD 64 in 64 bit mode CPUs are incapable of doing 80 bit floating pointer intermediate math unlike 32 bit mode. So >= 2^53 or >= 2^52 starts introducing more and more "error" or rounding into the math formula, and we have a 64 bit memory space on paper (more like 48 bits unless your a rack server of brand new Xeons, which I think finally took another chomp at the AMD64 ISA's central address space gap).
@@ -193,6 +193,14 @@ fresh_perl_like( | |||
eval q{() = (() or ((0) x 0)); 1}; | |||
is($@, "", "RT #130247"); | |||
|
|||
# [GH #13324] Perl croaks if a string repetition seems unsupportable | |||
fresh_perl_like( | |||
'use warnings; my $x = "A" x (2**99)', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we try some maximum toxic 0.0 NV literals here? maybe creating them with PP pack and or unpack.
Historically, given a statement like
my $x = "A" x SOMECONSTANT;
, no examination of the size of the multiplier (SOMECONSTANT
in this example) was done at compile time. Depending upon the constant folding behaviour, this might mean:This commit adds some compile time checking such that:
Things could obviously still go bad at runtime when the multiplier isn't a simple
constant, but that's not for this PR.
Closes #13324, closes #13793, closes #20586.
Besides general correctness checking, the arbitrary cut-off numbers are up for discussion, and please could reviewers suggest any improvements to the_perldiag.pod_ that come to mind.